melt() and dcast() are useful functions when you're trying to reshape your data, going from wide to long or vice versa. They are part of the reshape2 package (the older reshape package also has melt(), but it uses cast() rather than dcast()).
For wide to long:
library(reshape2)
wide <- read.csv("IENFD.csv")
#   ID IENFD.DL IENFD.DT IENFD.PT Sex
# 1  1     2.10     5.42    11.70   M
# 2  2     6.63     7.94     8.58   M
# 3  3     9.56    14.21     9.10   F
# 4  4     6.18     6.46     5.52   F
# With melt() you provide the ID variables (i.e., those that were not measured), and it assumes everything else was measured.
long <- melt(wide, id.vars = c("ID", "Sex"))
# The output is a new data frame with fewer columns and more rows.
# In this case, "ID" and "Sex" were kept as the index variables, and melt() created the new columns "variable" and "value".
#    ID Sex variable value
# 1   1   M IENFD.DL  2.10
# 2   2   M IENFD.DL  6.63
# 3   3   F IENFD.DL  9.56
# 4   4   F IENFD.DL  6.18
# 5   1   M IENFD.DT  5.42
# 6   2   M IENFD.DT  7.94
# 7   3   F IENFD.DT 14.21
# 8   4   F IENFD.DT  6.46
# 9   1   M IENFD.PT 11.70
# 10  2   M IENFD.PT  8.58
# 11  3   F IENFD.PT  9.10
# 12  4   F IENFD.PT  5.52
# The "variable" column is the ID of your repeated measure, and "value" holds the corresponding measurements.
# You can rename the columns with the code:
colnames(long)[c(3, 4)] <- c("Location", "IENFD")
#   ID Sex Location IENFD
# 1  1   M IENFD.DL  2.10
# 2  2   M IENFD.DL  6.63
# 3  3   F IENFD.DL  9.56
# You can also clean up the "Location" names by changing the labels on the levels:
levels(long$Location)
# [1] "IENFD.DL" "IENFD.DT" "IENFD.PT"
levels(long$Location) <- c("DL", "DT", "PT")
#    ID Sex Location IENFD
# 1   1   M       DL  2.10
# 2   2   M       DL  6.63
# 3   3   F       DL  9.56
# 4   4   F       DL  6.18
# 5   1   M       DT  5.42
# 6   2   M       DT  7.94
# 7   3   F       DT 14.21
# 8   4   F       DT  6.46
# 9   1   M       PT 11.70
# 10  2   M       PT  8.58
# 11  3   F       PT  9.10
# 12  4   F       PT  5.52
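For the "vice versa" direction (long to wide), something like the following should work. This is only a sketch: it assumes the reshape2 package is installed, and it builds the data frame inline from the values shown above instead of reading IENFD.csv. The name wide_again is just a label chosen for this example.

```r
library(reshape2)

# Rebuild the example data inline (values copied from the table above)
wide <- data.frame(
  ID       = 1:4,
  IENFD.DL = c(2.10, 6.63, 9.56, 6.18),
  IENFD.DT = c(5.42, 7.94, 14.21, 6.46),
  IENFD.PT = c(11.70, 8.58, 9.10, 5.52),
  Sex      = c("M", "M", "F", "F")
)

# Wide to long, naming the new columns directly instead of renaming afterwards
long <- melt(wide, id.vars = c("ID", "Sex"),
             variable.name = "Location", value.name = "IENFD")
levels(long$Location) <- c("DL", "DT", "PT")

# Long to wide: ID variables go on the left of the formula,
# the column whose levels become new columns goes on the right
wide_again <- dcast(long, ID + Sex ~ Location, value.var = "IENFD")
print(wide_again)
```

In the dcast() formula, each unique combination of ID and Sex becomes one row, and each level of Location becomes its own column filled from IENFD.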
model.matrix()
When you have factors as predictor variables, it can be very useful to generate a design matrix that specifies which effects (beta parameters) should come into play for each data point. The design matrix is essentially your data, re-formatted as a matrix of zeros and ones specifying the categorical attributes of each data point. In R, design matrices are easily generated using the model.matrix() function.
For example, say you had a large dataset with a categorical predictor variable with 10 levels (e.g., 10 different species), for which it would otherwise be tedious to assemble a design matrix by hand.
The simple R command x <- model.matrix(~species) will automatically generate the design matrix for modeling the effect of species with an intercept term (use model.matrix(~species-1) if you don't want an intercept term). If you had 2 categorical variables you wanted to include as predictor variables (e.g., species and habitat type), you could generate a design matrix for a model including main and interaction effects with the R command x <- model.matrix(~species*habtype).
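To see what these three calls produce, here is a small sketch on made-up data. The data frame d, its three species levels, and its two habitat levels are invented purely for illustration and are not taken from any real dataset.

```r
# Hypothetical toy data: 3 species crossed with 2 habitat types
d <- data.frame(
  species = factor(c("A", "A", "B", "B", "C", "C")),
  habtype = factor(c("forest", "open", "forest", "open", "forest", "open"))
)

# Intercept plus one indicator column per non-reference species level
x1 <- model.matrix(~species, data = d)

# No intercept: one indicator column per species level
x2 <- model.matrix(~species - 1, data = d)

# Main effects of species and habtype plus their interaction
x3 <- model.matrix(~species * habtype, data = d)

print(colnames(x1))
print(colnames(x2))
print(colnames(x3))
```

With the default treatment contrasts, the first level of each factor (here "A" and "forest") is the reference level, so it gets no column of its own when an intercept is present.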
For the seed predation example from Bolker's chapter 8, I used model.matrix to specify a model for the effect of species on seed predation rates as follows:
tapply() is very useful for applying a function (e.g., mean, max, length, or one of your own) to the observations within each group or level of a factor.
You can find a nice example of how to use tapply() and other variants of apply() here: http://nsaunders.wordpress.com/2010/08/20/a-brief-introduction-to-apply-in-r/
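As a quick illustration of tapply(), the sketch below reuses the IENFD values from the reshaping example, typed in by hand. The object names ienfd_dl, sex, group_means, and group_ranges are just labels chosen for this example.

```r
# Distal leg IENFD values and sex for the four subjects shown earlier
ienfd_dl <- c(2.10, 6.63, 9.56, 6.18)
sex      <- factor(c("M", "M", "F", "F"))

# Apply mean() to the observations within each level of sex
group_means <- tapply(ienfd_dl, sex, mean)
print(group_means)

# Any function works, including one of your own
group_ranges <- tapply(ienfd_dl, sex, function(v) max(v) - min(v))
print(group_ranges)
```

The result is a named vector with one element per factor level, so individual groups can be pulled out by name, e.g. group_means["F"].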
For more help with reshaping data, go here: http://wiki.stdout.org/rcookbook/Manipulating%20data/Converting%20data%20between%20wide%20and%20long%20format/